Practical Pattern Matching

Author

  • Dennis Taylor
Abstract

No new genes? At the University of Toronto, Brendan Frey is leading a group of scientists who are using AI techniques to analyze molecular biology data. One of their projects involves using a factor graph they developed called GenRate to discover and evaluate genes in mouse tissues. Factor graphs let researchers describe a system with complex variables, such as gene location in DNA as well as gene length and function. “What a factor graph is useful for,” says Frey, “is describing a scoring function that tells you how good each setting of the variables is.” Using samples from over 1 million probes along DNA in 37 different mouse tissues, the scientists used their factor graph to determine which bits of DNA are expressed, that is, activated to produce protein. A given stretch of DNA might be expressed in some tissues and not in others. DNA parts that have no function are never activated.

In the factor graph, each variable is a node. The scoring function comprises many local scoring functions, each of which looks at a small number of variables and assigns a score to every configuration of that small set. The total score is the sum of the local scores. “It’s a nice way to decompose a very complex problem into a whole bunch of simpler problems,” Frey says.

The scientists then compare the factor graph data to known gene patterns. Because the factor graph provides a computational framework for discovering configurations of variables as well as vetting the best one, the team came up with surprising results that led to a major revision of the view of the mammalian genome. Although some research claims many genes are left to discover, Frey’s team has shown that might not be true. “Beyond the genes we found,” Frey says, “we don’t believe there exists many new protein-coding genes.”
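The decomposition Frey describes, a total score built as a sum of local scores over small sets of variables, can be illustrated with a minimal Python sketch. The variable names, their values, and the local scoring functions below are hypothetical toy choices made for illustration only; they are not taken from GenRate. The brute-force search over all configurations is also an assumption made for brevity; practical factor-graph inference typically relies on message-passing algorithms instead.

```python
from itertools import product

# Hypothetical toy variables (NOT GenRate's actual model).
# Each variable name maps to the values it may take.
variables = {
    "expressed": [0, 1],
    "long": [0, 1],
    "coding": [0, 1],
}

# Each factor pairs the few variables it looks at with a local
# scoring function over those variables. Scores are made up.
factors = [
    (("expressed",), lambda e: 1.0 if e else 0.0),
    (("expressed", "coding"), lambda e, c: 2.0 if e == c else -1.0),
    (("long", "coding"), lambda l, c: 0.5 if l == c else 0.0),
]


def total_score(assignment):
    """Sum the local scores for one full setting of the variables."""
    return sum(
        score(*(assignment[name] for name in names))
        for names, score in factors
    )


def best_configuration():
    """Enumerate every configuration and return the highest scoring one.

    Exhaustive enumeration only works for tiny toy problems.
    """
    names = list(variables)
    candidates = (
        dict(zip(names, values))
        for values in product(*(variables[n] for n in names))
    )
    return max(candidates, key=total_score)


if __name__ == "__main__":
    best = best_configuration()
    print(best, total_score(best))
```

Each local function only ever sees the one or two variables it is attached to, which is what lets a large scoring problem be written as many small ones.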

Similar resources

A New Compression Method for Compressed Matching

A practical adaptive compression algorithm based on LZSS is presented, which is especially constructed to solve the compressed pattern matching problem, i.e., pattern matching directly in a compressed text without decompressing.


Two-phase Pattern Matching for Regular Expressions in Intrusion Detection Systems

Regular expressions are used to describe security threats’ signatures in network intrusion detection (NID) systems. To identify suspicious packets using regular expression matching, many NID systems use memory-based deterministic finite-state automata (DFA) with one-pass-scanning model, which is fast and allows dynamic updates. However, a number of practical signature patterns commonly found in...


EPSRC Vacation Bursary A Practical Investigation Into Modern Pattern Matching Techniques

Over recent years, there have been many theoretical advances in approximate pattern matching. The aim of this project has been to consider how these advances perform in practice, with the general aim of comparing the methods against a naïve approach in order to determine at what input sizes they become practical. Approximate pattern matching considers searching areas of a text string for areas ...


Discovering Most Classificatory Patterns for Very Expressive Pattern Classes

The classificatory power of a pattern is measured by how well it separates two given sets of strings. This paper gives practical algorithms to find the fixed/variable-length-don’t-care pattern (FVLDC pattern) and approximate FVLDC pattern which are most classificatory for two given string sets. We also present algorithms to discover the best window-accumulated FVLDC pattern and window-accumulat...


Algebraic Pattern Matching in Join Calculus

We propose an extension of the join calculus with pattern matching on algebraic data types. Our initial motivation is twofold: to provide an intuitive semantics of the interaction between concurrency and pattern matching; to define a practical compilation scheme from extended join definitions into ordinary ones plus ML pattern matching. To assess the correctness of our compilation scheme, we de...


arXiv:0802.4018v1 [cs.PL] 27 Feb 2008: Algebraic Pattern Matching in Join Calculus

We propose an extension of the join calculus with pattern matching on algebraic data types. Our initial motivation is twofold: to provide an intuitive semantics of the interaction between concurrency and pattern matching; to define a practical compilation scheme from extended join definitions into ordinary ones plus ML pattern matching. To assess the correctness of our compilation scheme, we de...



Publication date: 2006